Customer Segment Estimation Apparatus

ABSTRACT

In order to obtain customer state transition probabilities and short-term rewards conditioned by actions, customer behaviors are modeled with a hidden Markov model (HMM) using composite states each composed of a pair of a customer state and a marketing action. Parameters of the estimated hidden Markov model (the composite state transition probabilities and a reward distribution for each composite state) are further transformed into the customer state transition probabilities and the distribution of rewards for each customer state conditioned by marketing actions. In order to model purchase properties in more detail, a time interval between purchases (called an inter-purchase time, below) is always included as an element in the customer state vector, thereby allowing the customer state to have information on the probability distribution of the inter-purchase time.

FIELD OF THE INVENTION

The present invention relates to a customer segment estimation apparatus. More precisely, the present invention relates to an apparatus, a method and a program for estimating a customer segment in consideration of marketing actions.

BACKGROUND OF THE INVENTION

In direct marketing targeted at individual customers, there has been demand for maximization of the total value of profits gained from individual customers throughout their lifetime (customer lifetime value: customer equity). To attain this, an important task in marketing is to recognize (i) how customers' behavior characteristics change over time and (ii) how to guide customers' behavior characteristics in order to increase profits of a company (i.e., to select the most suitable marketing action).

As conventional methods for maximizing a customer lifetime value by using marketing actions, there have been a method using a Markov decision process (hereinafter, abbreviated as MDP) and a method using reinforcement learning (hereinafter, abbreviated as RL). The MDP method has a greater advantage in making a marketing strategy since it considers customer segments from a broader perspective.

In a case of using the MDP method, it is necessary to define customer states with Markov properties. However, the definitions of the customer states with Markov properties are not clear to humans in general. For this reason, there is a need for a tool for automatically determining definitions of customer states that satisfy Markov properties using only customer purchase data and marketing action data. The tool has a function of automatically defining M customer states satisfying Markov properties, when the number M of customer states is designated. In addition, the tool also has a function of providing transition probabilities from a customer state to other customer states with the strongest Markov properties among the ones discretely representing M customer states, and also providing a reward distribution from the customer states. The reward probability and the transition probabilities must be conditioned by marketing actions.

With a conventional technique, a hidden Markov model (hereinafter, abbreviated as HMM) is used for learning customer states with Markov properties. Examples of this have been proposed in Netzer, O., J. M. Lattin, and V. Srinivasan (2005, July), A Hidden Markov Model of Customer Relationship Dynamics, Stanford GSB Research Paper, and Ramaswamy, V. (1997), Evolutionary preference segmentation with panel survey data: An application to new products, International Journal of Research in Marketing 14, 57-80.

By use of the aforementioned conventional techniques, however, it has not been possible to define customer states in consideration of marketing actions, or to find out parameters that can be inputted to an MDP. Although Netzer, et al. take into consideration short-term/long-term effects of marketing actions, the functional form is limited, so that such effects cannot be practically inputted to the MDP. On the other hand, Ramaswamy attempts to make definitions of customer states reflect effects of marketing actions from the beginning.

SUMMARY OF THE INVENTION

In consideration of the foregoing problems, an object of the present invention is to define customer states with Markov properties with consideration of marketing actions that can be inputted to an MDP, and to obtain, as parameters of customer state, information on what kinds of effects marketing actions produce.

A first aspect of the present invention is to provide the following solving means.

The first aspect provides an apparatus for estimating a customer segment responding to a marketing action. The apparatus includes: an input unit for receiving customer purchase data obtained by accumulating purchase records of a plurality of customers, and marketing action data on actions taken on each of the customers; a feature vector generation unit for generating time series data of a feature vector composed of a pair of the customer purchase data and the marketing action data; an HMM parameter estimation unit for outputting distribution parameters of a hidden Markov model based on the time series data of the feature vector and the number of customer segments, for each composite state composed of a customer state classified by customer purchase characteristic and an action state classified by effect of a marketing action; and a state-action break-down unit for transforming the distribution parameters into parameter information for each customer segment.

More precisely, in order to estimate a customer segment (classification of customers, for example, classification of a high-profit customer segment, a medium-profit customer segment, a low-profit customer segment and the like) responding to a market action taken by a company, the apparatus receives an input of the customer purchase data, in which purchase records of the plurality of customers are accumulated, and the marketing action data of actions having been taken on each of the customers. Then, (i) the feature vector generation unit generates the time series data of the feature vector composed of a pair of the inputted customer purchase data and marketing action data. Next, (ii) the HMM parameter estimation unit outputs the distribution parameters of the hidden Markov model (HMM) based on the time series data of the feature vector outputted in (i), and the number of customer segments (additionally inputted), for each “composite state” composed of a pair of the “customer state” classified by purchase characteristic of a customer, and the “action state” classified by effect of a marketing action. Finally, (iii) the state-action break-down unit transforms the distribution parameters into the parameter information (customer segment information) per customer segment. The outputted customer segment information can be used as MDP parameters.

Moreover, in an additional aspect of the present invention, the customer purchase data contain an identification number of a customer, a purchase date of the customer and a vector of a transaction made by the customer at the purchase date. In addition, the time series data of the feature vector are vector data in which information containing sales/profits produced in each purchase transaction and an inter-purchase time are associated as a pair with a marketing action related to the purchase transaction. The marketing action data contain the number of a customer targeted by a market action, a purchase date estimated as when the customer makes a purchase possibly because of an effect of the market action, and a vector of a marketing action taken at the purchase date.

Furthermore, the distribution parameters include probability distributions of sales/profits, inter-purchase times and marketing actions, which are different among composite states, and transition rates of continuous-time Markov processes each indicating a transition from a composite state to another composite state. The parameter information for each customer segment contains transition probabilities from a customer state to other customer states (hereinafter, simply called customer state transition probabilities) and short-term rewards. The state-action break-down unit receives, as an input, a time interval determined for marketing actions (for example, one month when campaigns are made every second month).

In addition to providing an apparatus having the foregoing functions, other aspects of the present invention provide a method for controlling such an apparatus, and a computer program for implementing the method on a computer.

In restating the summary of the present invention, the aforementioned problem can be solved mainly by using the following ideas. Precisely, in order to obtain the customer state transition probabilities and short-term rewards conditioned by actions, customer behaviors are modeled with a hidden Markov model (HMM) using composite states each composed of a pair of a customer state and a marketing action. The parameters of the estimated hidden Markov model (the composite state transition probabilities and a reward distribution for each composite state) are further transformed into the customer state transition probabilities and the distribution of rewards for each customer state conditioned by marketing actions.

Furthermore, in order to model purchase characteristics in more detail, the customer state vector should always include a time interval between purchases (hereinafter, referred to as an inter-purchase time) as an element, thereby allowing the customer state to have information on the probability distribution of the inter-purchase times. Then, the problems are solved by combining the following three procedures.

(A) To generate time series data of a feature vector composed of a combination (pair) of a customer state and a marketing action taken by a company at this time;

(B) To output parameters of a hidden Markov model to which the generated time series data of the feature vector are inputted as observed results. The outputted parameters are parameters defined per composite state composed of a customer state and a marketing action, and the composite-state transition probabilities. In other words, these parameters incorporate information not only on how a customer state has changed, but also on how the company has changed its own actions.

(C) To compute the customer state transition probabilities and short-term rewards conditioned by marketing actions, by using the obtained parameters of the HMM as inputs. These can be used as MDP parameters, and thereby can be used to maximize long-term profit.

It should be noted that, unless action data of the company are inputted in (A), the composite state in (B) does not contain information on action changes of the company, which does not allow the information on the transition probabilities obtained in (C) to be different from each other among the marketing actions. In addition, if the procedure (C) is not performed, the parameters obtained at a time of completing (B) indicate unnecessary information on how the company's actions change (though the company's future actions should be selected while being optimized from the company's viewpoint), so that there is no effective way of using these parameters. Accordingly, a characteristic of the present invention is to combine the three procedures (A), (B) and (C).

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present invention and the advantage thereof, reference is now made to the following description taken in conjunction with the accompanying drawings.

FIG. 1 shows a functional configuration of a customer segment estimation apparatus 10 according to an embodiment of the present invention;

FIG. 2 shows a concept of time series data of vectors each composed of a pair of customer behavior and marketing action generated by a feature vector generation unit 11;

FIG. 3 shows changes over time of feature vectors as transitions between discrete composite states in an HMM parameter estimation unit 12;

FIG. 4 shows how to define a discrete customer state and an action state by factorizing each composite state into both of the axial directions in a state-action break-down unit 13;

FIG. 5 is a diagram showing that a state-action break-down unit 13 computes a rate at which a composite state composed of a combination of different customer state and action state belongs to each of known composite states;

FIG. 6 shows that the state-action break-down unit 13 computes, by using the probabilities of belonging to the composite states, a transition probability with which an arbitrary customer state transits to another customer state when an arbitrary marketing action is taken thereon;

FIG. 7 shows that the state-action break-down unit 13 computes, by using the probabilities of belonging to the composite states, rewards (profits) obtained between arbitrary customer states when an arbitrary action is taken;

FIG. 8 shows that the transition probability and reward distribution obtained by the state-action break-down unit 13 are MDP parameters;

FIG. 9 shows a generation example of feature vector time series data 23 in an example;

FIG. 10 shows a screen displaying parameters obtained by a state-action break-down unit 13 in the example;

FIG. 11 shows additional information to be displayed on the screen in FIG. 10; and

FIG. 12 is a diagram showing a hardware configuration of a customer segment estimation apparatus 10 of an embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

According to the present invention, it is possible to examine what kinds of short-term and long-term effects marketing actions produce in accordance with customer states, and thereby to select the most suitable marketing actions in consideration of the customer states.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a diagram showing a functional configuration of a customer segment estimation apparatus 10 according to an embodiment of the present invention. As shown in FIG. 1, the apparatus 10 includes three computation units called a feature vector generation unit 11, an HMM parameter estimation unit 12 and a state-action break-down unit 13. In addition, units indicated by reference numerals 21 to 26 are data inputted to or outputted from the computation units, or storage units for storing the data therein.

Note that, although the storage units of customer purchase data 21 and marketing action data 22 are provided in the apparatus 10 in FIG. 1, these data may be inputted from the outside through a network. Moreover, the number of customer segments 24 may be inputted by an operator directly, or by an external system. The apparatus 10 may also include input units such as a keyboard and a mouse, a display unit such as an LCD or a CRT, and a communication unit as a network interface. Hereinafter, general descriptions will be provided for the feature vector generation unit 11, the HMM parameter estimation unit 12, and the state-action break-down unit 13 with reference to FIG. 1 together with FIGS. 2 to 8.

<Feature Vector Generation Unit 11>

The feature vector generation unit 11 processes original data in order to apply the original data to the hidden Markov model of the present invention. The feature vector generation unit 11 generates vector data from the customer purchase data 21 and the marketing action data 22. In the vector data, information on sales/profits and the like generated per transaction and inter-purchase times are associated as a pair with marketing actions related to the transactions. In this way, feature vector time series data 23 are generated.

FIG. 2 is a conceptual diagram of time series data of vectors each composed of a set of a customer behavior and a marketing action. In FIG. 2, the vertical axis indicates customer behaviors such as profit, sales and a mail response rate, and the horizontal axis indicates marketing actions (actions carried out by a company). This example shows how samples of January (indicated by ) transit to samples of February (indicated by ◯).

<HMM Parameter Estimation Unit 12>

The HMM parameter estimation unit 12 estimates distribution parameters 25 of a purchase model of the present invention from the feature vector time series data 23. For this estimation, the desired number of customer segments 24 is designated from the outside. Alternatively, the number of customer segments itself can also be optimized by using the designated value as an initial value. With respect to each discrete composite state called a state-action pair, the distribution parameters 25 include (i) probability distributions (of sales/profits, inter-purchase times and marketing actions) that are different from those of other composite states, and (ii) transition rates of continuous-time Markov processes indicating transitions between composite states.

FIG. 3 shows changes over time of such feature vectors as transitions between discrete composite states. The composite states are obtained by classifying sets of customer behavior and marketing action into several categories, and are here expressed as z₁, z₂ and z₃. Detailed descriptions of the composite state will be provided later. Note that a composite state after the foregoing processing still contains meaningless information on “how company behaviors change.”

<State-Action Break-down Unit 13>

The state-action break-down unit 13 converts the distribution parameters 25 per composite state obtained by the HMM parameter estimation unit 12, into parameters (customer segment information 26) of each customer segment that indicates original characteristics of customers. The state-action break-down unit 13 receives an input of a time interval determined for marketing actions 27 (for example, a period for a campaign if the campaign is made), and outputs (i) probability distributions (of the sales/profits and inter-purchase time) for each of the customer segments, and (ii) customer segment transition probabilities. In addition, the parameters (i) and (ii) are functions of marketing action. The parameters obtained by the state-action break-down unit 13 can be inputted to the MDP. Alternatively, instead of being inputted to the MDP, the parameters can be used for finding which customer segment tends to respond to what kind of action.

FIGS. 4 to 8 conceptually explain processing in the state-action break-down unit 13. FIG. 4 shows how to define a discrete customer state and an action state by factorizing each composite state into both of the axial directions. Here, composite states z₁, z₂ and z₃ are factorized into customer states s₁, s₂ and s₃ and action states d₁, d₂ and d₃, respectively. The customer state, the action state and the composite state will be described below.

The customer state s is one of several kinds of classes into which customer characteristics are classified. Here, the customer characteristics indicate, for example, how much money a customer is likely to spend at a shop and how often a customer is likely to visit a shop. For instance, assume that, given combinations of sales and purchase frequency as customer characteristics, the combinations are classified into 4 classes. In this case, a possible classification includes the following 4 classes: s₁=(high sales and high visiting frequency), s₂=(high sales but low visiting frequency), s₃=(low sales, but high visiting frequency), and s₄=(low sales and low visiting frequency). In practice, such a classification must not be determined subjectively, but must be determined on the basis of data.

The action state d is one of several kinds of classes into which combinations of variables taken as market actions are classified according to effects of the market actions. For example, taking pricing as an example of the market actions, assume that the pricing is classified into three classes according to the effect thereof. At this time, three classes such as d₁=cheap, d₂=normal and d₃=expensive may be used for classification. The action state, too, must not be determined subjectively, but must be determined on the basis of data.

The composite state z is one of several classes into which combinations of a customer characteristic and marketing action taken by the company are classified. For example, given that the customer characteristic is a purchase price, and that the marketing action is a price, a possible classification example of the states (composite states) each indicating a combination of a customer characteristic and a company behavior includes z₁=(a high price is presented to a high-sales customer), z₂=(a low price is presented to a high-sales customer), z₃=(a high price is presented to a low-sales customer) and z₄=(a low price is presented to a low-sales customer). Such classification must also be determined on the basis of data, especially on the basis of a change in the customer characteristic thereafter.

FIG. 5 is a diagram showing that it is possible to compute and thus find a rate at which an arbitrary composite state of a combination of a different customer state and action state belongs to each of the known composite states. Here, as an example, statistical processing is used to find the probability that a combination (s₁, d₃) of a different customer state and action state belongs to each of the composite states z₁, z₂ and z₃. The probabilities found for z₁ (s₁, d₁), z₂ (s₂, d₂) and z₃ (s₃, d₃) are 30%, 25% and 45%, respectively.

FIG. 6 shows that customer state transition probabilities are computed with the probabilities of belonging to the composite states, when an arbitrary marketing action is taken on an arbitrary customer state. In FIG. 6, assuming that the action of the action state d₃ is taken on the customer state s₁, a transition probability from the customer state s₁ to each of the customer states is computed. An oval 60 surrounding (s₁, d₃) indicates that the action of the action state d₃ is taken on the customer state s₁. Horizontally long ovals 61, 62 and 63 indicate the customer states s₁, s₂ and s₃. Each of the ovals 61, 62 and 63 is evenly distributed and extends uniformly along the horizontal axis, since the customer state does not contain the information on marketing action. Accordingly, the computation here aims to find out which point in which oval of s₁, s₂ and s₃ a point existing in the oval (s₁, d₃) is likely to transit to.

This computation uses the composite state transition probabilities, and the probabilities that the customer state s₁ belongs to composite states z_(m) when the action of the action state d₃ is taken on the customer state s₁. Here, the composite state transition probabilities are already computed by the HMM parameter estimation unit 12. In addition, the probability that the customer state s₁ belongs to each of the composite states z_(m) when the action of the action state d₃ is taken is computed for each of the composite states z_(m) in the method shown in FIG. 5. For example, the probability that the customer state s₁ transits to the customer state s₂ when the action of the action state d₃ is taken on the customer state s₁ is computed by multiplying the following two probabilities for each of the composite states z_(m) and adding up the products. Specifically, one of the probabilities is that the composite state z₂ is generated from the composite state z_(m), and the other is that the customer state s₁ belongs to the composite state z_(m) when the action of the action state d₃ is taken on the customer state s₁.
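The weighted sum described for FIG. 6 can be illustrated with a minimal Python sketch. The belonging probabilities are the 30%, 25% and 45% of the FIG. 5 example; the composite state transition matrix `trans` and the function name `customer_transition` are hypothetical and chosen only for illustration. As in the description above, a transition into the composite state z_(k) is read as a transition into the customer state s_(k).

```python
# Belonging probabilities P(z_m | s1, d3) taken from the FIG. 5 example.
belong = [0.30, 0.25, 0.45]

# Hypothetical composite state transition probabilities (illustrative values,
# not taken from the source): trans[m][k] = P(z_k is generated from z_m).
trans = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
]


def customer_transition(belong, trans):
    """For each target state k, sum over m of belong[m] * trans[m][k]."""
    size = len(trans)
    return [
        sum(belong[m] * trans[m][k] for m in range(size))
        for k in range(size)
    ]


# probs[k] approximates P(s_{k+1} | s1, d3) under the assumptions above.
probs = customer_transition(belong, trans)
print(probs)
```

With these assumed numbers, the probability of moving from s₁ to s₂ under d₃ is 0.30·0.3 + 0.25·0.5 + 0.45·0.2 = 0.305, and the three resulting probabilities sum to 1 because each row of `trans` does.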

FIG. 7 shows that rewards (profits) obtained from arbitrary customer states when an arbitrary action is taken are computed by using the probabilities of belonging to the composite states. In FIG. 7, the distribution of profits obtained when the action of action state d₃ is taken on the customer state s₁ is computed. The differences among the distributions of profits obtained from the customer states are known, and reflected in distribution profiles shown on the left side of FIG. 7. Accordingly, a desired distribution can be obtained once the rates at which all the distributions are to be combined are known. The combining rates are computed in the method shown in FIG. 5, as the probability that the customer state s₁ belongs to each of the composite states z_(m) when the action of the action state d₃ is taken thereon. Hence, an asymmetrical distribution shown in a center part of FIG. 7 can be obtained by using these combining rates.
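The combination of reward distributions described for FIG. 7 amounts to a mixture weighted by the combining rates. The following Python sketch assumes, purely for illustration, that the profit distribution of each composite state is normal; the (mean, standard deviation) pairs in `components` and all function names are hypothetical, while the combining rates reuse the FIG. 5 example.

```python
import math

# Combining rates P(z_m | s1, d3) from the FIG. 5 example.
belong = [0.30, 0.25, 0.45]

# Hypothetical (mean, std) of profit for each composite state z_1..z_3.
components = [(100.0, 10.0), (60.0, 15.0), (20.0, 5.0)]


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, comps):
    """Profit density for (s1, d3): weighted sum of per-state densities."""
    return sum(w * normal_pdf(x, m, s) for w, (m, s) in zip(weights, comps))


def mixture_mean(weights, comps):
    """Expected short-term reward of the combined distribution."""
    return sum(w * m for w, (m, _) in zip(weights, comps))


expected_reward = mixture_mean(belong, components)

# Riemann-sum check that the combined density integrates to about 1.
step = 0.1
total_mass = sum(
    mixture_pdf(-100.0 + k * step, belong, components) * step
    for k in range(3001)
)
print(expected_reward, total_mass)
```

Because the three components have different means and spreads, the combined density is asymmetrical in general, which matches the shape described for the center part of FIG. 7.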

FIG. 8 shows that the obtained transition probabilities and reward distribution are MDP parameters. Here, the following probabilities and distribution are figured out when the action of the action state a₃ is taken on the customer state s₁: the probabilities that the customer state s₁ transits to s₂ and s₃; the probability that the customer state s₁ stays at s₁; and the reward (profit) distribution.

Hereinafter, detailed descriptions will be provided for a more specific computation method used in the aforementioned feature vector generation unit 11, HMM parameter estimation unit 12 and state-action break-down unit 13.

[Feature Vector Generation Unit 11]

To the feature vector generation unit 11, customer purchase data and marketing action data are inputted. The customer purchase data include: an index c ∈ C (where C is a set of customers) indicating a customer number; t_(c, n) indicating a date when a customer c makes an n-th purchase; and a reward vector r_(c, n) of rewards produced by the customer c on the date t_(c, n). Here, 1≦n≦N_(c) where N_(c) denotes the number of purchase transactions by the customer c. Any element can be designated as r_(c, n) as needed. Examples of such an element are a scalar quantity of a total value of sales of all products purchased on the date, and a two-dimensional vector containing total values of sales of product categories A and B arranged side by side. Not only sales but also a gross profit or an amount of used points of a promotion program may be used as the reward vector. Hereinafter, the reward vector r_(c, n) is simply referred to as a reward.

The marketing action data include:

(i) a customer number c ∈ C targeted by the marketing action,

(ii) a purchasing date t_(c, n) on which a customer makes a purchase, possibly because of the effect of the marketing action, and

(iii) a marketing action vector a_(c, n) carried out on the above date t_(c, n).

In a case where any information among the above is not available, interpolation is performed for the information as needed. As a_(c, n), a usable example is a discount rate of a product offered to the customer, a numerical value of bonus points provided to the customer according to a membership program, or a vector obtained by combining these two values. In addition, an action of “doing nothing” can also be defined by determining an action vector value corresponding to this action (for example, all elements are 0). Hereinafter, the marketing action vector a_(c, n) will be simply referred to as an action.

The feature vector generation unit 11 generates and outputs the following feature vector time series data 23 from the foregoing input data:

(i) a customer number c, and

(ii) a feature vector v_(c, n)=(r_(c, n), τ_(c, n), a_(c, n))^(T) in the n-th transaction of the customer c.

( )^(T) indicates a transposed vector. Moreover, τ_(c, n)=t_(c, n+1)−t_(c, n), where τ_(c, n) denotes the inter-purchase time of the n-th transaction. r_(c, n) and a_(c, n) are defined for 1≦n≦N_(c), and τ_(c, n) is defined for 1≦n≦N_(c)−1. In other words, the feature vector is a vector consisting of a combination of the reward, the inter-purchase time and the action. Hereinafter, {r_(c, 1), r_(c, 2), . . . , r_(c, N_(c))} is simply expressed as

$r_{1}^{N_{c}}$.  [Formula 1]

Similarly,

$a_{1}^{N_{c}},\ t_{1}^{N_{c}},\ \tau_{1}^{N_{c}-1}$  [Formula 2]

are defined.
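The construction of the feature vectors for a single customer can be sketched in Python as follows. The sketch assumes scalar rewards and scalar actions (a discount rate), measures the inter-purchase time in days, and fills a missing action with a “doing nothing” value of 0; the names `build_feature_vectors` and `NO_ACTION` and all data values are hypothetical.

```python
from datetime import date

# Assumption: a "doing nothing" action is encoded as 0.0 when no action
# record exists for a purchase date.
NO_ACTION = 0.0


def build_feature_vectors(purchases, actions):
    """Return [(r_n, tau_n, a_n)] for one customer.

    purchases: list of (purchase date t_n, reward r_n)
    actions:   dict mapping a purchase date to the action a_n
    tau_n = t_{n+1} - t_n in days; it is None for the last purchase,
    for which no inter-purchase time is defined.
    """
    ordered = sorted(purchases)  # order transactions by purchase date
    vectors = []
    for n, (t, r) in enumerate(ordered):
        if n + 1 < len(ordered):
            tau = (ordered[n + 1][0] - t).days
        else:
            tau = None
        vectors.append((r, tau, actions.get(t, NO_ACTION)))
    return vectors


# Illustrative data: three purchases, discount offered on two of them.
purchases = [
    (date(2024, 1, 5), 120.0),
    (date(2024, 1, 19), 80.0),
    (date(2024, 3, 1), 45.0),
]
actions = {date(2024, 1, 5): 0.10, date(2024, 3, 1): 0.05}

vectors = build_feature_vectors(purchases, actions)
print(vectors)
```

The last tuple carries only the reward and the action, which corresponds to r_(c, n) and a_(c, n) being defined up to N_(c) while τ_(c, n) is defined only up to N_(c)−1.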

[HMM Parameter Estimation Unit 12]

<Model and Overview>

The HMM parameter estimation unit 12 estimates parameters Q and Θ, with the number M of customer segments designated, from input data

$D=\{v_{c,n}=(r_{c,n},\tau_{c,n},a_{c,n})^{T},\ r_{c,N_{c}},\ a_{c,N_{c}};\ c\in C,\ 1\le n\le N_{c}-1\}$,  [Formula 3]

and then outputs the parameters.

The parameter Q={q_(ij); 1≦i, j≦M} is a parameter of a continuous-time Markov process called a generator matrix, and is an M×M matrix. This parameter indicates the degree of transition between latent states called composite states. The composite state is a state indicating a pair of a latent customer segment and a latent marketing action segment. The parameter Θ={Θ_(m); 1≦m≦M} is a parameter showing the distribution of a feature vector assigned to each of the composite states. Θ_(m) denotes a distribution parameter contained in the composite state m. This parameter differs depending on what type of distribution of a feature vector is employed. The present invention does not limit the type of distribution of a feature vector, but an example of the feature vector having normal distribution will be described later.

The HMM parameter estimation unit 12 figures out the model parameters Q and Θ, with the log likelihood of the learning data expressed by the following equations (1) and (2). There are several derivation methods for these parameters, and the present invention is not limited to any of the parameter derivation methods. When the parameters maximizing the log likelihood are figured out, a maximum likelihood estimation method is used, and, in practice, an Expectation Maximization algorithm (EM algorithm) is used. Only an example of this case will be described later. When the expected values in the posterior distributions of parameters are figured out, a Bayesian inference method is used. In this case, practically, a variational Bayes method is used. Moreover, the HMM parameters can also be estimated by using a sampling method called Markov chain Monte Carlo (MCMC).

[Formula 4]

$\begin{aligned}
L(D \mid Q,\Theta) &= \sum_{c \in C} \log \sum_{z_{1}^{N_{c}}} P\left(r_{1}^{N_{c}}, t_{1}^{N_{c}}, a_{1}^{N_{c}}, z_{1}^{N_{c}} \mid Q,\Theta\right) && (1) \\
P\left(r_{1}^{N_{c}}, t_{1}^{N_{c}}, a_{1}^{N_{c}}, z_{1}^{N_{c}} \mid Q,\Theta\right) &= P(z_{c,1} \mid t_{c,1}) \left[\prod_{n=1}^{N_{c}-1} F\left(r_{c,n},\tau_{c,n},a_{c,n} \mid \Theta_{z_{c,n}}\right) P\left(z_{c,n+1} \mid z_{c,n},\tau_{c,n},Q\right)\right] F\left(r_{c,N_{c}}, a_{c,N_{c}} \mid \Theta_{z_{c,N_{c}}}\right) && (2)
\end{aligned}$

In the equations (1) and (2), z_(c, n) is the composite state generating the feature vector v_(c, n) of the n-th transaction of the customer c, and takes a value within a range of 1≦z_(c, n)≦M. In addition, we denote a sequence of the composite states as

$z_{1}^{N_{c}}=z_{c,1},z_{c,2},\ldots,z_{c,N_{c}}$.  [Formula 5]

The equation (1) expresses the log likelihood obtained by summing the probability of outputting the feature vector time series over all latent state sequences that could occur. P(z_(c, n+1)|z_(c, n), τ_(c, n), Q) indicates the probability that, given the generator matrix Q, the latent state z_(c, n) of the customer c transits to the latent state z_(c, n+1) when a time τ_(c, n) elapses after the customer c makes a purchase at a time t_(c, n). F(·|Θ_(m)) denotes the probability density function of outputting the feature vector designated in the latent state m.
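The passage above does not spell out how P(z_(c, n+1)|z_(c, n), τ_(c, n), Q) is evaluated. For a time-homogeneous continuous-time Markov process, the standard relation is P(τ)=exp(Qτ), and the following Python sketch assumes that relation. The matrix exponential is computed with a simple scaled Taylor series; the function names and the two-state generator matrix are hypothetical and serve only as an illustration.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def mat_exp(a, terms=20, squarings=10):
    """Matrix exponential via a Taylor series with scaling and squaring."""
    size = len(a)
    scale = 2 ** squarings
    scaled = [[x / scale for x in row] for row in a]
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes scaled^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [
            [r + t for r, t in zip(r_row, t_row)]
            for r_row, t_row in zip(result, term)
        ]
    for _ in range(squarings):
        result = mat_mul(result, result)  # undo the scaling
    return result


def transition_matrix(q, tau):
    """P(tau) = exp(Q * tau); entry [i][j] is P(j | i, tau, Q)."""
    return mat_exp([[x * tau for x in row] for row in q])


# Hypothetical generator matrix: off-diagonal rates, rows summing to zero.
Q = [[-0.1, 0.1], [0.2, -0.2]]
P = transition_matrix(Q, 5.0)

# Closed form for a two-state chain, used here only as a sanity check.
closed_form_p11 = 2.0 / 3.0 + (1.0 / 3.0) * math.exp(-1.5)
print(P, closed_form_p11)
```

Each row of the resulting matrix sums to 1, so it can be used directly as the transition factor in equation (2) for an inter-purchase time τ.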

P(z_(c, 1)|t_(c, 1)) denotes the probability of an initial state of the customer c at a time t_(c, 1). If the number of times that the customer makes a purchase is sufficiently great, the influence of the probability of the initial state can be ignored. For simplification, assume that the initial states of all the customers c ∈ C are the same at a first purchase date t_(c, 1).

<Algorithm>

Here, descriptions will be given for an EM algorithm based on maximum likelihood estimation as an example of a practical method of estimating the HMM parameters. This estimation method is just an example of the application of the present invention. When the maximum likelihood estimation is used as a framework, the log likelihood is transformed into the following equation (3).

$\begin{matrix}\left\lbrack {{Formula}\mspace{20mu} 6} \right\rbrack & \; \\{{L\left( {\left. D \middle| Q \right.,\Theta} \right)} = {\sum\limits_{c \in C}\; {\log {\sum\limits_{i}\; {\sum\limits_{j}\; {{\alpha_{c,n}(i)}{F\left( {r_{c,n},\tau_{c,n},\left. a_{c,n} \middle| \Theta_{i} \right.} \right)}{P\left( {\left. j \middle| i \right.,\tau_{c,n},Q} \right)}{\beta_{c,{n + 1}}(j)}}}}}}} & (3) \\{{\alpha_{c,1}(i)} = {P\left( i \middle| t_{c,1} \right)}} & (4) \\{{\alpha_{c,{n + 1}}(j)} \propto {\sum\limits_{i}\; {{\alpha_{c,n}(i)}{F\left( {r_{c,n},\tau_{c,n},\left. a_{c,n} \middle| \Theta_{i} \right.} \right)}{P\left( {\left. j \middle| i \right.,\tau_{c,n},Q} \right)}}}} & (5) \\{{\beta_{c,N_{c}}(i)} = 1} & (6) \\{{\beta_{c,n}(i)} = {\sum\limits_{j}\; {{F\left( {r_{c,n},\tau_{c,n},\left. a_{c,n} \middle| \Theta_{i} \right.} \right)}{P\left( {\left. j \middle| i \right.,\tau_{c,n},Q} \right)}{\beta_{c,{n + 1}}(j)}}}} & (7)\end{matrix}$

α_(c, n+1)(j) is referred to as the forward probability, and indicates the probability P(j|v_(c, 1), . . . , v_(c, n)) that, given the feature vectors v_(c, 1), v_(c, 2), . . . , v_(c, n), the customer c is in the latent state j at the time t_(c, n+1). This forward probability satisfies

Σ_(j)α_(c,n+1)(j)=1.  [Formula 7]

β_(c, n)(i) is referred to as the backward probability, and indicates the probability

P(υ_(c,n+1), . . . , υ_(c,N) _(c) |i)  [Formula 9]

that a feature vector

υ_(c,n+1), υ_(c,n+2), . . . υ_(c,N) _(c)   [Formula 8]

is generated from the latent state i. α_(c, n+1)(j) and β_(c, n)(i) can be recursively computed by using the formulas (5) and (7), respectively.
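The recursions (4) to (7) can be sketched for a single customer as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function name `forward_backward` and the argument layout are assumptions, `emis[n][i]` stands for a precomputed value of F(r_(c,n), τ_(c,n), a_(c,n)|Θ_(i)), and `trans[n][i][j]` stands for P(j|i, τ_(c,n), Q).

```python
def forward_backward(init, emis, trans):
    """Forward/backward passes of equations (4)-(7) for one customer.

    init  : list of M initial state probabilities P(i | t_1)        (hypothetical layout)
    emis  : emis[n][i]     = F(v_n | Theta_i), for n = 0..N-1
    trans : trans[n][i][j] = P(j | i, tau_n, Q), for n = 0..N-2
    """
    n_obs, n_states = len(emis), len(init)

    # Equation (4): alpha_1(i) = P(i | t_1).
    alpha = [list(init)]
    # Equation (5): alpha_{n+1}(j) is proportional to sum_i alpha_n(i) F_n(i) P(j|i),
    # normalized so that each alpha sums to one (assumes the total is non-zero).
    for n in range(n_obs - 1):
        raw = [
            sum(alpha[n][i] * emis[n][i] * trans[n][i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        total = sum(raw)
        alpha.append([x / total for x in raw])

    # Equation (6): beta_N(i) = 1.
    beta = [[1.0] * n_states for _ in range(n_obs)]
    # Equation (7): beta_n(i) = sum_j F_n(i) P(j|i) beta_{n+1}(j).
    for n in range(n_obs - 2, -1, -1):
        beta[n] = [
            emis[n][i] * sum(trans[n][i][j] * beta[n + 1][j] for j in range(n_states))
            for i in range(n_states)
        ]
    return alpha, beta
```

For long purchase histories the unnormalized β values can underflow; rescaling them at each step is a common remedy and is left out of the sketch.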

In order to use the EM algorithm, a lower bound of the equation (3) is derived by using Jensen's inequality. At this time, a new latent variable

u^(ij) _(c,n)  [Formula 10]

is introduced. This variable indicates the probability that the latent state i transits to the latent state j during a period [t_(c, n), t_(c, n+1)]. When the latent variable is introduced, the estimation algorithm is expressed as follows.

<E-step:>

$\begin{matrix}\left\lbrack {{Formula}\mspace{20mu} 11} \right\rbrack & \; \\{{\alpha_{c,1}(i)} = {P\left( i \middle| t_{c,1} \right)}} & (8) \\{{\alpha_{c,{n + 1}}(j)} \propto {\sum\limits_{i}\; {{\alpha_{c,n}(i)}{F\left( {r_{c,n},\tau_{c,n},\left. a_{c,n} \middle| \Theta_{i} \right.} \right)}{P\left( {\left. j \middle| i \right.,\tau_{c,n},Q} \right)}}}} & (9) \\{{\beta_{c,N_{c}}(i)} = 1} & (10) \\{{\beta_{c,n}(i)} = {\sum\limits_{j}\; {{F\left( {r_{c,n},\tau_{c,n},\left. a_{c,n} \middle| \Theta_{i} \right.} \right)}{P\left( {\left. j \middle| i \right.,\tau_{c,n},Q} \right)}{\beta_{c,{n + 1}}(j)}}}} & (11) \\{u_{c,n}^{ij} \propto {{\alpha_{c,n}(i)}{F\left( {r_{c,n},\tau_{c,n},\left. a_{c,n} \middle| \theta_{i} \right.} \right)}{P\left( {\left. j \middle| i \right.,\tau_{c,n},Q} \right)}{\beta_{c,{n + 1}}(j)}}} & (12)\end{matrix}$

<M-step:>

$\begin{matrix}\left\lbrack {{Formula}\mspace{20mu} 12} \right\rbrack & \; \\{{P\left( i \middle| t_{c,1} \right)} \propto {\sum\limits_{c \in C}\; {\alpha_{c,1}(i)}}} & (13) \\{\theta_{i} = {\arg \; {\max\limits_{\theta_{l_{i}}}{\sum\limits_{c \in C}\; {\sum\limits_{n = 1}^{N_{c} - 1}{\left( {\sum\limits_{j}\; u_{c,n}^{ij}} \right)\log \; {F\left( {r_{c,n},\tau_{c,n},\left. a_{c,n} \middle| \theta_{i} \right.} \right)}}}}}}} & (14) \\{Q = {\arg \; {\max\limits_{Q}{\sum\limits_{c \in C}\; {\sum\limits_{n = 1}^{N_{c} - 1}\; {\sum\limits_{i}\; {\sum\limits_{j}\; {u_{c,n}^{ij}\log \; {P\left( {\left. j \middle| i \right.,\tau_{c,n},Q} \right)}}}}}}}}} & (15)\end{matrix}$

1. Set proper initial values for the parameters Q and Θ, or for the latent variable

{u^(ij) _(c,n);c ∈ C,1≦n≦N_(c),1≦i,j≦M}  [Formula 13]

2. Repeat the above E-step and M-step until the parameters converge.
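The latent variable of equation (12) can be sketched for one transaction interval as follows. This is a hedged illustration: the function name `transition_posteriors` and its arguments are hypothetical, and the proportionality in (12) is assumed here to be resolved by normalizing over all pairs (i, j) for the given customer and transaction.

```python
def transition_posteriors(alpha_n, emis_n, trans_n, beta_next):
    """Equation (12): u^{ij} proportional to alpha_n(i) F_n(i) P(j|i) beta_{n+1}(j).

    alpha_n   : forward probabilities alpha_{c,n}(i)
    emis_n    : emis_n[i] = F(v_{c,n} | theta_i)
    trans_n   : trans_n[i][j] = P(j | i, tau_{c,n}, Q)
    beta_next : backward probabilities beta_{c,n+1}(j)
    """
    n_states = len(alpha_n)
    raw = [
        [alpha_n[i] * emis_n[i] * trans_n[i][j] * beta_next[j] for j in range(n_states)]
        for i in range(n_states)
    ]
    # Assumption: normalize so that the M x M table sums to one.
    total = sum(sum(row) for row in raw)
    return [[x / total for x in row] for row in raw]
```

The row sums Σ_(j) u^(ij) of the returned table are the weights that appear in the M-step equation (14).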

In practice, the above estimation algorithm cannot be implemented unless the distribution of a feature vector and a model of the latent state transition probability are specified. However, this distribution can be freely selected at the user's own discretion. Accordingly, only one example is shown here, in which a normal distribution is used for the feature vector. When the normal distribution is used for the feature vector, considering that the inter-purchase time always takes a positive real number, the latent state is determined so that the inter-purchase time follows a lognormal distribution and the other feature vector quantities follow the normal distribution. Specifically, the latent state is modeled by using the equation

F(r_(c,n), τ_(c,n), a_(c,n)|θ_(m)) = N(r_(c,n), log τ_(c,n), a_(c,n); μ_(m), Σ_(m))  (16),  [Formula 14]

and by using Θ_(m)={μ_(m); Σ_(m)} as the parameter Θ_(m) in practice. In addition, the feature vector with the log-transformed inter-purchase time is expressed as the following equation,

χ_(c,n)=(r _(c,n), log τ_(c,n) ,a _(c,n))^(T).  [Formula 15]

Moreover, the latent state transition probability should correspond to a continuous-time Markov process. However, in consideration of the computation time and the characteristics of proper customer segments, the transition probability is approximated as shown in an equation (17). This equation is established on the assumption that the latent state does not change as rapidly as the inter-purchase time τ. Since learning a customer segment whose customer state changes rapidly between successive purchase data is useless in practice, such an assumption is employed.

$\begin{matrix}\left\lbrack {{Formula}\mspace{20mu} 16} \right\rbrack & \; \\{{P\left( {\left. j \middle| i \right.,\tau,Q} \right)} = \left\{ {\begin{matrix}\frac{1}{1 + {\lambda_{i}\tau}} & {{{if}\mspace{14mu} j} = i} \\{\frac{\lambda_{i}\tau}{1 + {\lambda_{i}\tau}}p_{ij}} & {{{if}\mspace{14mu} j} \neq i}\end{matrix},} \right.} & (17)\end{matrix}$

where Q={q_(ij); 1≦i, j≦M} is expressed using a parameter

$\begin{matrix}\left\lbrack {{Formula}\mspace{20mu} 17} \right\rbrack & \; \\{q_{ij} = \left\{ {\begin{matrix}{- \lambda_{i}} & {{{if}\mspace{14mu} j} = i} \\{\lambda_{i}p_{ij}} & {{{if}\mspace{14mu} j} \neq i}\end{matrix}.} \right.} & (18)\end{matrix}$

On the above assumption, the equation (14) of the foregoing M-step is equivalent to equations (19) and (20), and the equation (15) thereof is equivalent to equations (21) and (22).

$\begin{matrix}\left\lbrack {{Formula}\mspace{20mu} 18} \right\rbrack & \; \\{\mu_{i} = \frac{\sum\limits_{c \in C}\; {\sum\limits_{n = 1}^{N_{c} - 1}\; {\left( {\sum\limits_{j}\; u_{c,n}^{ij}} \right)x_{c,n}}}}{\sum\limits_{c \in C}\; {\sum\limits_{n = 1}^{N_{c} - 1}\; \left( {\sum\limits_{j}\; u_{c,n}^{ij}} \right)}}} & (19) \\{\sum\limits_{i}{= \frac{\sum\limits_{c \in C}\; {\sum\limits_{n = 1}^{N_{c} - 1}\; {\left( {\sum\limits_{j}\; u_{c,n}^{ij}} \right)\left( {\chi_{c,n} - \mu_{i}} \right)\left( {\chi_{c,n} - \mu_{i}} \right)^{T}}}}{\sum\limits_{c \in C}\; {\sum\limits_{n = 1}^{N_{c} - 1}\; \left( {\sum\limits_{j}\; u_{c,n}^{ij}} \right)}}}} & (20)\end{matrix}$

It is necessary to find a solution of the equation (21) by using a one-dimensional Newton-Raphson method for each λ_(i). In practice, however, by using

$\begin{matrix}{{\lambda_{i}\tau_{n}\mspace{11mu} \text{<<}\mspace{11mu} 1},{\frac{1}{1 + {\lambda_{i}\tau_{c,n}}} \cong {1 - {\lambda_{i}\tau_{c,n}}}},} & \left\lbrack {{Formula}\mspace{14mu} 19} \right\rbrack\end{matrix}$

the equation (21) can be computed from an equation (23).

$\begin{matrix}\text{[Formula~~20]} & \; \\{\lambda_{i} = \frac{\sum\limits_{c \in C}{\sum\limits_{n = 1}^{N_{c} - 1}{\sum\limits_{j \neq i}u_{c,n}^{ij}}}}{\sum\limits_{c \in C}{\sum\limits_{n = 1}^{N_{c} - 1}{\tau_{c,n}{\sum\limits_{j}u_{c,n}^{ij}}}}}} & (23)\end{matrix}$

In the case of using the equation (23), when the parameter becomes close to the local solution, the likelihood does not increase monotonically, but fluctuates up and down. For this reason, the execution of the iteration algorithm is stopped when the fluctuation starts, or the Newton-Raphson method is used after the fluctuation starts.
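The closed-form update (23) can be sketched as follows. The function name `update_rate` and the data layout are assumptions for illustration: `u[n]` is the M×M table of u^(ij) for the n-th transaction interval (pooled over customers) and `taus[n]` is the matching inter-purchase time.

```python
def update_rate(i, u, taus):
    """Approximate M-step update of lambda_i, equation (23).

    i    : index of the latent state whose rate is updated
    u    : list of M x M tables, u[n][i][j] = u^{ij} for interval n
    taus : list of inter-purchase times, one per interval
    """
    # Numerator: expected number of transitions out of state i (j != i).
    leaving = sum(
        sum(table[i][j] for j in range(len(table)) if j != i) for table in u
    )
    # Denominator: expected time spent in state i, weighted by tau.
    exposure = sum(tau * sum(table[i]) for table, tau in zip(u, taus))
    return leaving / exposure
```

Since (23) relies on λ_(i)τ << 1, a caller following the text would monitor the likelihood and stop iterating, or switch to the Newton-Raphson method, once it starts to fluctuate.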

[State-Action Break-down Unit 13]

The state-action break-down unit 13 transforms the parameters Q and Θ outputted by the HMM parameter estimation unit 12, receives an input of the time interval determined for marketing actions, and outputs the parameters of the discrete-time Markov Decision Process defined by M kinds of discrete customer states and M kinds of discrete action states. Both the customer states (= the reward and inter-purchase time) and the action states essentially take continuous values. However, by expressing each of the parameters as a linear combination of parameters defined for a limited number of discrete values, the solutions of the parameters can in reality be found by using the MDP. The outputted parameters are as follows:

-   the parameter of the distribution of probability P(r, τ|s_(i)) that a reward r and an inter-purchase time τ are generated from a customer state s_(i).
-   the parameter of the distribution of probability P(a|d_(j)) that an action vector a is generated from an action state d_(j).
-   the probability λ_(m)(i, j) that a set (s_(i), d_(j)) of the customer state s_(i) and the action state d_(j) belongs to the composite state z_(m).
-   the probability P_(τ)(s_(k)|s_(i), d_(j)) that a customer in the customer state s_(i) changes the state to a customer state s_(k) when a time τ elapses after an action belonging to the action state d_(j) is taken on the customer.
-   the parameter of the distribution of probability P(r, τ|s_(i), d_(j)) of observing the reward r and inter-purchase time τ after an action belonging to the action state d_(j) is taken on the customer in the customer state s_(i).

Note that τ in P_(τ)(s_(k)|s_(i), d_(j)) is manually given in consideration of an interval between campaign implementations (that is, a time interval to be used for optimization through the MDP).

The key point of the state-action break-down unit 13 is to compute a rate at which a set of the i-th customer state s_(i) and the j-th action state d_(j) belongs to each of the composite states z_(m) learned by the HMM parameter estimation unit 12. In short, the point is to compute λ_(m)(i, j) described above. According to the present invention, all of the reward, the inter-purchase time and the action vector are determined only stochastically. For this reason, even when the above set is in the i-th customer state s_(i), the set stochastically belongs to all the composite states z_(m). Similarly, even when the set is in the j-th action state d_(j), the set stochastically belongs to all the composite states z_(m).

Firstly, the definitions of the customer state and action state are given. The reward and inter-purchase time are generated from the customer state, and the action vector is generated from the action state. Accordingly, the customer state s_(i) and the action state d_(j) are defined as equations (24) and (25), respectively. Note that a correlation between the reward and action vector is lost by making the decomposition as shown in the equations (24) and (25).

P(r,τ|s_(i)) = ∫_(a) P(r,τ,a|z_(i)) da  (24)  [Formula 21]

P(a|d_(j)) = ∫_(r)∫_(τ) P(r,τ,a|z_(j)) dr dτ  (25)

Next, the state-action break-down unit 13 determines a rate at which the composite state (s_(i), d_(j)) defined in the equations (24) and (25) belongs to each of the composite states z_(m) for each pair of i and j. This can be solved firstly by calculating the distance between the feature vector distribution P(v|s_(i), d_(j))=P(r, τ|s_(i)) P(a|d_(j)) and the feature vector distribution P(v|z_(m)) of each known composite state, and then by calculating a reciprocal ratio among the distances. An arbitrary measure depending on the case can be used for this distance measure, and this example employs the Mahalanobis distance between the average value of P(v|s_(i), d_(j))=P(r, τ|s_(i)) P(a|d_(j)) and P(v|z_(m)). Assuming that d(·, ·) denotes the distance measure between the distributions, and that λ_(m)(i, j) denotes the probability that, given the customer state s_(i) and the action state d_(j), the set thereof belongs to the composite state z_(m),

p ≡ P(r,τ|s_(i)) P(a|d_(j))  (26)  [Formula 22]

q_(m) ≡ P(r,τ,a|z_(m))  (27)

λ_(m)(i,j) ∝ 1/d(p, q_(m))  (28).
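The reciprocal ratio of (28) can be sketched as follows for one pair (i, j). The function name `membership_weights` is hypothetical, the distances d(p, q_(m)) are assumed to have been computed beforehand with whatever measure is chosen, and the weights are normalized to sum to one over m.

```python
def membership_weights(distances):
    """Turn distances d(p, q_m) into rates lambda_m(i, j), relation (28).

    distances : list of strictly positive distances, one per composite state z_m
    """
    # Assumption: every distance is positive; a zero distance would need
    # special handling (for example, assigning all weight to that state).
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [x / total for x in inverse]
```

For distances (1, 2, 4) the resulting weights are (4/7, 2/7, 1/7), so the closest composite state receives the largest share.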

The parameters for the MDP are figured out from the proportional expression (28). Firstly, descriptions will be given for a procedure of figuring out the probability P_(τ)(s_(k)|s_(i), d_(j)) that the customer state s_(i) transits to the customer state s_(k) when the time τ elapses after the action d_(j) is taken on the customer state s_(i). Here, transitions to all the possible composite states to which the customer state s_(i)/action state d_(j) would belong are considered, and then the probability of obtaining the customer state s_(k) from the composite states after the transitions is considered. Thus, the probability is expressed as

$\begin{matrix}\text{[Formula~~23]} & \; \\{{P_{\tau}\left( {\left. s_{k} \middle| s_{i} \right.,d_{j}} \right)} = {\sum\limits_{z_{1}}{\sum\limits_{z_{2}}{{P\left( s_{k} \middle| z_{2} \right)}{P_{\tau}\left( z_{2} \middle| z_{1} \right)}{{P\left( {\left. z_{1} \middle| s_{i} \right.,d_{j}} \right)}.}}}}} & (29)\end{matrix}$

Paying attention to the fact that the customer state s_(k) is figured out by integrating out all information on the actions by using the equation (24), it practically suffices to regard P(s_(k)|z₂) as 1 only when k=z₂, and as 0 otherwise (if a more exact calculation is needed, Bayes' theorem may be used). As a result,

$\begin{matrix}\text{[Formula~~24]} & \; \\{{P_{\tau}\left( {\left. s_{k} \middle| s_{i} \right.,d_{j}} \right)} = {\sum\limits_{m}{{P_{\tau}\left( k \middle| m \right)}{{\lambda_{m}\left( {i,j} \right)}.}}}} & (30)\end{matrix}$
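Equation (30) is a weighted sum of rows of the composite state transition matrix, which can be sketched as follows. The function name `customer_transition` and the layout are assumptions: `p_tau[m][k]` stands for P_(τ)(k|m) evaluated at the designated interval τ, and `lam_ij[m]` for λ_(m)(i, j) of one fixed pair (i, j).

```python
def customer_transition(p_tau, lam_ij):
    """Equation (30): P_tau(s_k | s_i, d_j) = sum_m P_tau(k | m) * lambda_m(i, j).

    p_tau  : p_tau[m][k] = composite state transition probability P_tau(k | m)
    lam_ij : lam_ij[m]   = lambda_m(i, j) for the pair (i, j) of interest
    """
    n_states = len(lam_ij)
    return [
        sum(p_tau[m][k] * lam_ij[m] for m in range(n_states))
        for k in range(n_states)
    ]
```

Because each row of `p_tau` and the weights `lam_ij` sum to one, the returned list is again a probability distribution over the customer states s_(k).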

Subsequently, descriptions will be given for a procedure of figuring out the distribution P(r, τ|s_(i), d_(j)) of the reward/inter-purchase time to be obtained when the action of the action state d_(j) is taken on the customer state s_(i). To figure this out, the distribution (of reward/inter-purchase time) given a composite state and an action vector a is needed first, and this can be figured out from an equation (31).

$\begin{matrix}\text{[Formula~~25]} & \; \\ P\left( r,\tau \middle| z_{m},a \right) = \frac{P\left( r,\tau,a \middle| z_{m} \right)}{\int_{r}\int_{\tau} P\left( r,\tau,a \middle| z_{m} \right)\, dr\, d\tau} & (31)\end{matrix}$

There are two possible methods of figuring out P(r, τ|s_(i), d_(j)): one yields a mixed distribution using rates of λ_(m)(i, j), and the other yields a distribution in which parameters are mixed at rates of λ_(m)(i, j). The former mixed distribution is expressed as

$\begin{matrix}\text{[Formula~~26]} & \; \\ P\left( r,\tau \middle| s_{i},d_{j} \right) = \int_{a} \sum\limits_{m} P\left( r,\tau \middle| z_{m},a \right) \lambda_{m}(i,j) P\left( a \middle| d_{j} \right)\, da. & (32)\end{matrix}$

In the latter case, a specific example will be described later because a mixture of parameters is carried out in the parameter region. Since the foregoing formulas contain many integral computations, one may consider that it takes a long time to compute them. In practice, however, if an analytically tractable distribution (for example, a multivariate normal distribution) is selected for the distribution of the feature vector, these formulas can be solved analytically. The only computation actually necessary is that of several matrices. The aforementioned processing of the state-action break-down unit 13 can be summarized as the following steps.

Step 1: compute the distribution parameters R_(i) and A_(j) of P(r, τ|s_(i)) and P(a|d_(j)) by using the equations (24) and (25), and P(r, τ, a|z_(m))=f(r, τ, a|Θ_(m)) using Θ obtained by the HMM parameter estimation unit 12. The computations are carried out for all (i, j) of M×M ways.

Step 2: by using the parameters R_(i) and A_(j) found in step 1, and the formulas (26), (27) and (28), compute the probability λ_(m)(i, j) that, given a set of the customer state s_(i) and the action state d_(j), the set thereof belongs to the composite state z_(m). The computations are carried out for all (i, j, m) of M×M×M ways.

Step 3: designate a desired time interval τ between marketing actions to be used for the MDP. Then, from the equation (30) using Q={q_(ij)} obtained by the HMM parameter estimation unit 12 and the parameters R_(i) and A_(j) found in step 1, compute the probability P_(τ)(s_(k)|s_(i), d_(j)) that the customer state s_(i) transits to the customer state s_(k) when the time τ elapses after the action belonging to the action state d_(j) is taken on the customer in the customer state s_(i). The computations are carried out for all (i, j, k) of M×M×M ways.

Step 4: assign the parameters found in step 1 and λ_(m)(i, j) found in step 2 to the equations (31) and (32), thereby computing the parameter Ω_(ij) of the distribution P(r, τ|s_(i), d_(j)) of probability that the reward r/inter-purchase time τ are observed when the action belonging to the action state d_(j) is taken on a customer in the customer state s_(i). The computations are carried out for all (i, j) of M×M ways.

Step 5: P_(τ)(s_(k)|s_(i), d_(j)) obtained in step 3 and the parameters Ω_(ij) found in step 4 are parameters applicable to the MDP. Moreover, the parameters R_(i) and A_(j) found in step 1 and λ_(m)(i, j) figured out in step 2 are needed for assigning the actual purchase data to the customer state and the action state. Accordingly, store the parameters R_(i), A_(j), λ_(m)(i, j), P_(τ)(s_(k)|s_(i), d_(j)) and Ω_(ij).

As an implementation example of the state-action break-down unit 13, descriptions will be given for a case where (r, log τ, a)^(T) is set so as to be normally distributed. In this case, various integration computations can be analytically solved in the foregoing steps. Here, in the equation

f(r,τ,a|θ _(m))=N(r, log τ,a;μ _(m), Σ_(m))  (33)

a component (having a superscript (s) attached thereto) of μ_(m) and Σ_(m) relating to (r, log τ), and a component (having a superscript (d) attached thereto) of μ_(m) and Σ_(m) relating to a, are expressed separately as follows. Note that a superscript (sd) is attached to a part concerning a correlation between the two components.

$\begin{matrix}\text{[Formula~~28]} & \; \\ \mu_{m} = \begin{pmatrix}\mu_{m}^{(s)} \\ \mu_{m}^{(d)}\end{pmatrix} & (34) \\ \Sigma_{m} = \begin{pmatrix}\Sigma_{m}^{(s)} & \Sigma_{m}^{(sd)} \\ \left( \Sigma_{m}^{(sd)} \right)^{T} & \Sigma_{m}^{(d)}\end{pmatrix} & (35)\end{matrix}$

Firstly, P(r, τ|s_(i)) and P(a|d_(j)) can be respectively figured out from

$\begin{matrix}\text{[Formula~~29]} & \; \\ P\left( r,\tau \middle| s_{i} \right) = N\left( r,\log\tau;\ \mu_{i}^{(s)},\Sigma_{i}^{(s)} \right) & (36) \\ P\left( a \middle| d_{j} \right) = N\left( a;\ \mu_{j}^{(d)},\Sigma_{j}^{(d)} \right). & (37)\end{matrix}$

In order to determine λ_(m)(i, j), the Mahalanobis distance is computed, and

$\begin{matrix}\text{[Formula~~30]} & \; \\ \left\lbrack d\left( p,q_{m} \right) \right\rbrack^{2} = \left( \mu_{ij} - \mu_{m} \right)^{T} \Sigma_{m}^{- 1} \left( \mu_{ij} - \mu_{m} \right) & (38)\end{matrix}$

is obtained, where

$\begin{matrix}\text{[Formula~~31]} & \; \\ \mu_{ij} = \begin{pmatrix}\mu_{i}^{(s)} \\ \mu_{j}^{(d)}\end{pmatrix} & (39) \\ \Sigma_{ij} = \begin{pmatrix}\Sigma_{i}^{(s)} & 0 \\ 0 & \Sigma_{j}^{(d)}\end{pmatrix}. & (40)\end{matrix}$

Hence, λ_(m)(i, j) is figured out from the following proportional expression (41).

$\begin{matrix}\text{[Formula~~32]} & \; \\ \lambda_{m}(i,j) \propto \left\lbrack \left( \mu_{ij} - \mu_{m} \right)^{T} \Sigma_{m}^{- 1} \left( \mu_{ij} - \mu_{m} \right) + {tr}\left( \Sigma_{m}^{- 1}\Sigma_{ij} \right) \right\rbrack^{- 1}, & (41)\end{matrix}$

where Σ_(m)λ_(m)(i, j)=1.

Lastly, the equation (30) is directly used, and the equations (31) and (32) are rearranged as follows,

$\begin{matrix}\text{[Formula~~33]} & \; \\ P\left( r,\tau \middle| z_{m},a \right) = N\left( r,\log\tau;\ \mu_{m}^{(s)}(a),\Sigma_{m}^{(s)}(a) \right) & (42) \\ \mu_{m}^{(s)}(a) = \mu_{m}^{(s)} + \Sigma_{m}^{(sd)} \left( \Sigma_{m}^{(d)} \right)^{- 1} \left( a - \mu_{m}^{(d)} \right) & (43) \\ \Sigma_{m}^{(s)}(a) = \Sigma_{m}^{(s)} - \Sigma_{m}^{(sd)} \left( \Sigma_{m}^{(d)} \right)^{- 1} \left( \Sigma_{m}^{(sd)} \right)^{T}. & (44)\end{matrix}$
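The conditioning in (43) and (44) can be sketched as follows for the simplest case. This sketch assumes, purely for illustration, that the action vector a is one-dimensional, so that the inverse of Σ_(m)^((d)) reduces to a division by a scalar variance; the function name `condition_on_action` and its arguments are hypothetical.

```python
def condition_on_action(mu_s, mu_d, cov_s, cov_sd, var_d, a):
    """Equations (43)-(44) for a scalar action a (illustrative assumption).

    mu_s   : mean of (r, log tau) in composite state z_m, list of length 2
    mu_d   : mean of the scalar action in z_m
    cov_s  : 2 x 2 covariance of (r, log tau)
    cov_sd : covariances between (r, log tau) and the action, list of length 2
    var_d  : variance of the action (must be positive)
    a      : observed or planned action value
    """
    dim = len(mu_s)
    # Sigma^(sd) (Sigma^(d))^-1 collapses to an element-wise division.
    gain = [c / var_d for c in cov_sd]
    # Equation (43): shift the mean in proportion to the action's deviation.
    mean = [m + g * (a - mu_d) for m, g in zip(mu_s, gain)]
    # Equation (44): the covariance shrinks and does not depend on a.
    cov = [
        [cov_s[p][q] - gain[p] * cov_sd[q] for q in range(dim)]
        for p in range(dim)
    ]
    return mean, cov
```

For a multi-dimensional action vector the same two equations apply with a proper matrix inverse in place of the division.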

As described above, there are two methods of finding P(r, τ|s_(i), d_(j)). In a case of using a mixed distribution, P(r, τ|s_(i), d_(j)) is found as a contaminated normal distribution,

$\begin{matrix}\text{[Formula~~34]} & \; \\ P\left( r,\tau \middle| s_{i},d_{j} \right) = \sum\limits_{m} \lambda_{m}(i,j)\, N\left( r,\log\tau;\ \mu_{m}^{(s)}(i,j),\Sigma_{m}^{(s)}(i,j) \right), & (45)\end{matrix}$

where

$\begin{matrix}\text{[Formula~~35]} & \; \\ \mu_{m}^{(s)}(i,j) = \mu_{m}^{(s)} + \Sigma_{m}^{(sd)} \left( \Sigma_{m}^{(d)} \right)^{- 1} \left( \mu_{j}^{(d)} - \mu_{m}^{(d)} \right) & (46) \\ \Sigma_{m}^{(s)}(i,j) = \Sigma_{m}^{(s)} - \Sigma_{m}^{(sd)} \left( \Sigma_{m}^{(d)} \right)^{- 1} \left( \Sigma_{m}^{(sd)} \right)^{T}. & (47)\end{matrix}$

In a case of mixing parameters in the parameter region, P(r, τ|s_(i), d_(j)) is found as an equation,

$\begin{matrix}\text{[Formula~~36]} & \; \\ P\left( r,\tau \middle| s_{i},d_{j} \right) = N\left( r,\log\tau;\ \sum\limits_{m} \lambda_{m}(i,j)\,\mu_{m}^{(s)}(i,j),\ \sum\limits_{m} \lambda_{m}(i,j)\,\Sigma_{m}^{(s)}(i,j) \right), & (48)\end{matrix}$

that is, a single normal distribution.
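The parameter mixing of equation (48) can be sketched as follows. The function name `mix_parameters` is hypothetical; `means[m]` and `covs[m]` are assumed to hold μ_(m)^((s))(i, j) and Σ_(m)^((s))(i, j) from (46) and (47), and `lams[m]` the rates λ_(m)(i, j) of one fixed pair (i, j).

```python
def mix_parameters(lams, means, covs):
    """Equation (48): mix the component parameters at rates lambda_m(i, j).

    lams  : list of weights lambda_m(i, j), summing to one
    means : means[m] is the mean vector of component m
    covs  : covs[m] is the covariance matrix of component m
    Returns the mean and covariance of the single resulting normal distribution.
    """
    dim = len(means[0])
    mean = [sum(w * mu[k] for w, mu in zip(lams, means)) for k in range(dim)]
    cov = [
        [sum(w * c[a][b] for w, c in zip(lams, covs)) for b in range(dim)]
        for a in range(dim)
    ]
    return mean, cov
```

In contrast, the mixed distribution of equation (45) keeps all M components together with their weights, so it can be multimodal, whereas (48) always yields one unimodal distribution.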

As an example of the present invention, descriptions will be given for examples of GUIs provided by software to which the present invention is applied. FIG. 9 shows an example of generating the feature vector time series data 23. The feature vector data are generated from purchase records with timestamps and from marketing action records that are separate from the purchase records. Table 90 on the upper-left side shows the purchase records, Table 91 on the upper-right side shows the marketing action records, and Table 92 on the lower side shows the generated feature vector time series data 23. Table 90 stores, in chronological order, the sales amounts (dollars) of each of the product groups of products purchased by the customer of Customer ID=1. Table 91 similarly stores, in chronological order, the marketing actions that a company has taken on the customers of Customer IDs=1 to 5. As the marketing actions, Table 91 illustrates the setting of a discount rate, the providing of points and the providing of an option. In Table 92, the timestamps are transformed into the inter-purchase times (Inter_purchase), and marketing action vectors are each allocated to a corresponding date (the next approximate date after an action is taken). Zero vectors are allocated to dates when no actions are taken. Since the purchase data are huge in practice, such data are less likely to be displayed on a screen, and the processing is automatically carried out.
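The transformation from Tables 90 and 91 into Table 92 can be sketched for one customer as follows. This is a minimal sketch under stated assumptions: the function name `build_feature_series` and the tuple layouts are hypothetical, the inter-purchase time is taken as the number of days until the next purchase, and "the next approximate date" is interpreted as the first purchase date on or after the action date.

```python
from datetime import date


def build_feature_series(purchases, actions, action_dim):
    """Build (reward, inter-purchase days, action vector) rows for one customer.

    purchases  : list of (purchase_date, reward) tuples
    actions    : list of (action_date, action_vector) tuples
    action_dim : length of an action vector, used for the zero vectors
    """
    purchases = sorted(purchases)
    dates = [d for d, _ in purchases]

    # Zero vectors by default, for purchase dates without any action.
    vectors = [[0.0] * action_dim for _ in purchases]
    for action_date, vector in sorted(actions):
        # Assumption: attach the action to the first purchase on or after it.
        # If several actions map to one purchase, the latest one wins here.
        for n, purchase_date in enumerate(dates):
            if purchase_date >= action_date:
                vectors[n] = list(vector)
                break

    series = []
    for n, (purchase_date, reward) in enumerate(purchases):
        # The last purchase has no successor, so its inter-purchase time is None.
        tau = (dates[n + 1] - purchase_date).days if n + 1 < len(dates) else None
        series.append((reward, tau, vectors[n]))
    return series
```

For purchases on 2024-01-01, 2024-01-11 and 2024-01-31 and one action on 2024-01-05, the action vector is attached to the second row and the inter-purchase times are 10 days, 20 days and undefined.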

FIG. 10 is a screen displaying the parameters obtained by the state-action break-down unit 13. FIG. 10 shows characteristics of a customer state (here, referred to as a customer segment) named ‘Frequent Buyer.’ ‘Frequent Buyer’ is a name given here for convenience, and in fact just indicates a selected one of the customer segments s₁ to s_(M). A rectangular area 101 on the left side of the screen displays various information on the designated customer segment as information on probability distributions computed using stored parameters. The information displayed in this example is the information on the distribution of inter-purchase times, the distribution of rewards and the segment transition probabilities. FIG. 11 shows additional information displayed on the screen of FIG. 10. This information is provided as descriptions explaining tendencies of this customer state that are deduced from the distribution characteristics. The descriptions can be automatically created if appropriate rules are decided.

A rectangular area 102 labeled ‘Specify action’ on the right side of the screen is a user input area used for inputting an action vector or designating an action state. When a ‘Recalculate parameters’ button 103 is pressed after desired values and the like are inputted, the information on the left and lower sides of the screen is updated. This update reflects changes in the obtained customer state, that is, the reward, the inter-purchase time and the customer segment transition probabilities, in response to marketing actions.

The aforementioned information can help a marketer to understand a market. In particular, the marketer can observe changes in the customer segment transition probabilities in several different patterns by experimentally changing the values of actions in the rectangular area 102 on the right side of the screen. With this operation, the marketer can qualitatively understand what types of actions should be taken to nurture more profitable customers. As a matter of course, in the ultimate mathematical optimization, marketing actions to be recommended are more precisely computed by solving a maximization problem of the MDP using stored parameters.

[Hardware Configuration]

FIG. 12 is a diagram showing a hardware configuration of a customer segment estimation apparatus 10 according to an embodiment of the present invention. The general configuration will be described below as an information processing apparatus whose typical example is a computer. In a case of a dedicated apparatus or a built-in apparatus, however, a required minimum configuration can be selected in response to its installation environment, as a matter of course.

The customer segment estimation apparatus 10 includes a central processing unit (CPU) 1010, a bus line 1005, a communication I/F 1040, a main memory 1050, a basic input output system (BIOS) 1060, a parallel port 1080, a USB port 1090, a graphic controller 1020, a VRAM 1024, a sound processor 1030, an I/O controller 1070 and input means such as a keyboard and a mouse adapter 1100. A storage medium such as a flexible disk (FD) drive 1072, a hard disk 1074, an optical disc drive 1076 or a semiconductor memory 1078 can be connected to the I/O controller 1070. A display device 1022 is connected to the graphic controller 1020, and an amplifier circuit 1032 and a speaker 1034 are connected as options to the sound processor 1030.

The BIOS 1060 stores programs such as a boot program executed by the CPU 1010 at a startup time of the customer segment estimation apparatus 10 and a program depending on hardware of the customer segment estimation apparatus 10. The FD (flexible disk) drive 1072 reads a program or data from a flexible disk 1071, and provides the read-out program or data to the main memory 1050 or the hard disk 1074 via the I/O controller 1070.

A DVD-ROM drive, a CD-ROM drive, a DVD-RAM drive or a CD-RAM drive can be used as the optical disc drive 1076, for example. In this case, an optical disc 1077 compliant with each of the drives needs to be used. The optical disc drive 1076 can read a program or data from the optical disc 1077, and can also provide the read-out program or data to the main memory 1050 or the hard disk 1074 via the I/O controller 1070.

A computer program provided to the customer segment estimation apparatus 10 is stored in a storage medium such as the flexible disk 1071, the optical disc 1077 or a memory card, and thus is provided by a user. This computer program is read from any of the storage media via the I/O controller 1070, or downloaded via the communication I/F 1040. Then, the computer program is installed on the customer segment estimation apparatus 10, and then executed. An operation that the computer program causes the information processing apparatus to execute is the same as the operation in the foregoing apparatus, and the description thereof is omitted here.

The foregoing computer program may be stored in an external storage medium. In addition to the flexible disk 1071, the optical disc 1077 or the memory card, a magneto-optical storage medium such as an MD and a tape medium can be used as the storage medium. Alternatively, the computer program may be provided to the customer segment estimation apparatus 10 via a communication line, by using, as a storage medium, a storage device such as a hard disk or an optical disc library provided in a server system connected to a private communication line or the Internet.

The foregoing example mainly explains the customer segment estimation apparatus 10. However, it is possible to achieve the same functions as those of the foregoing information processing apparatus by installing a program having the same functions on a computer, and then by causing the computer to operate as the information processing apparatus. Accordingly, the information processing apparatus described as an embodiment of the present invention can be constructed by using the foregoing method and a computer program implementing the method.

The apparatus 10 of the present invention can be constructed by employing hardware, software or a combination of hardware and software. In the case of the construction using a combination of hardware and software, a typical example is the construction using a computer system including a certain program. In this case, the certain program is loaded to the computer system and then executed, thereby causing the computer system to execute processing according to the present invention. This program is composed of a group of instructions that can be expressed in an arbitrary language, code or expression. In accordance with such a group of instructions, the system can directly execute specific functions, or can execute the specific functions after either or both of (1) converting the language, code or expression into another one, and (2) copying the instructions to another medium. As a matter of course, the scope of the present invention includes not only such a program itself, but also a program product including a medium in which such a program is stored. A program for implementing the functions of the present invention can be stored in an arbitrary computer readable medium such as a flexible disk, an MO, a CD-ROM, a DVD, a hard disk device, a ROM, an MRAM and a RAM. In order to store the program in a computer readable medium, the program can be downloaded from another computer system connected to the system via a communication line, or can be copied from another medium. Moreover, the program can be compressed to be stored in a single storage medium, or be divided into more than one piece to be stored in more than one storage medium.

Although the embodiments of the present invention have been described hereinabove, the present invention is not limited to the foregoing embodiments. Moreover, the effects described in the embodiments of the present invention are only enumerated examples of the most preferable effects made by the present invention, and the effects of the present invention are not limited to those described in the embodiments or examples of the present invention.

1. An apparatus for estimating a customer segment responding to a marketing action, comprising: an input unit for receiving customer purchase data obtained by accumulating purchase records of a plurality of customers, and marketing action data on actions taken on each of the customers; a feature vector generation unit for generating time series data of a feature vector composed of a pair of the customer purchase data and the marketing action data; an HMM parameter estimation unit for outputting distribution parameters of a hidden Markov model based on the time series data of the feature vector and the number of customer segments, for each composite state composed of a customer state classified by customer purchase characteristic and an action state classified by the effects of a marketing action; and a state-action break-down unit for transforming the distribution parameters into parameter information for each customer segment.
2. The apparatus according to claim 1, wherein the customer purchase data contain an identification number of a customer, a purchase date of the customer and a vector of a transaction made by the customer at that purchase date.
3. The apparatus according to claim 1, wherein the time series data of the feature vector are vector data in which information containing sales/profits produced in each purchase transaction and an inter-purchase time are associated as a pair with a marketing action related to the purchase transaction.
4. The apparatus according to claim 1, wherein the marketing action data contain the customer number targeted by a marketing action, a purchase date estimated as when the customer makes a purchase possibly because of an effect of the marketing action, and a vector of a marketing action taken at the purchase date.
5. The apparatus according to claim 1, wherein the distribution parameters include probability distributions of sales/profits, inter-purchase times and marketing actions, which differ among composite states, and transition rates of continuous-time Markov processes indicating transitions from a composite state to other composite states.
6. The apparatus according to claim 1, wherein the parameter information for each customer segment contains transition probabilities from a customer state to other customer states, and a short-term reward.
7. The apparatus according to claim 1, wherein the state-action break-down unit receives a time interval determined for marketing actions as an input.
8. A method of estimating a customer segment responding to a marketing action, comprising the steps of: receiving customer purchase data obtained by accumulating purchase records of a plurality of customers, and marketing action data on actions taken on each of the customers; generating time series data of a feature vector composed of a pair of the customer purchase data and the marketing action data; outputting distribution parameters of a hidden Markov model based on the time series data of the feature vector and the number of customer segments, for each composite state composed of a customer state classified by customer purchase characteristic and an action state classified by effect of a marketing action; and transforming the distribution parameters into parameter information for each customer segment.
9. A computer program for estimating a customer segment responding to a marketing action, causing a computer to execute the steps of: receiving customer purchase data obtained by accumulating purchase records of a plurality of customers, and marketing action data on actions taken on each of the customers; generating time series data of a feature vector composed of a pair of the customer purchase data and the marketing action data; outputting distribution parameters of a hidden Markov model based on the time series data of the feature vector and the number of customer segments, for each composite state composed of a customer state classified by customer purchase characteristic and an action state classified by effect of a marketing action; and transforming the distribution parameters into parameter information for each customer segment.
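
By way of illustration only, the feature vector generation step recited in claims 2 to 4 might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the function name build_feature_series, the tuple layouts, the use of a single sales amount in place of a transaction vector, the "none" label for purchases with no associated marketing action, and the choice to measure the inter-purchase time in days since the customer's previous purchase (so that the first purchase of each customer is dropped) are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def build_feature_series(purchases, actions):
    """Pair each purchase with its inter-purchase time and marketing action.

    purchases: iterable of (customer_id, purchase_date, amount)
    actions:   iterable of (customer_id, purchase_date, action_label)
    Returns {customer_id: [(amount, inter_purchase_days, action_label), ...]}.
    All names and layouts here are illustrative assumptions.
    """
    # Assumption: at most one marketing action is tied to a given purchase.
    action_at = {(cid, day): label for cid, day, label in actions}

    by_customer = defaultdict(list)
    for cid, day, amount in purchases:
        by_customer[cid].append((day, amount))

    series = {}
    for cid, records in by_customer.items():
        records.sort(key=lambda r: r[0])  # chronological order per customer
        rows = []
        # Assumption: the first purchase has no preceding purchase,
        # so it yields no inter-purchase time and is skipped.
        for (prev_day, _), (day, amount) in zip(records, records[1:]):
            gap_days = (day - prev_day).days
            rows.append((amount, gap_days, action_at.get((cid, day), "none")))
        series[cid] = rows
    return series


if __name__ == "__main__":
    example_purchases = [
        ("c1", date(2024, 1, 1), 30.0),
        ("c1", date(2024, 1, 11), 45.0),
        ("c1", date(2024, 2, 10), 20.0),
    ]
    example_actions = [("c1", date(2024, 1, 11), "coupon")]
    print(build_feature_series(example_purchases, example_actions))
```

Each resulting row corresponds to one feature vector of the time series, in which the sales amount and the inter-purchase time are associated as a pair with the marketing action related to that purchase. Such a series would then be passed to the HMM parameter estimation step, which is not sketched here.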